334 research outputs found

    PhD-SNPg: a webserver and lightweight tool for scoring single nucleotide variants

    One of the major challenges in human genetics is to identify the functional effects of coding and non-coding single nucleotide variants (SNVs). Several methods have been developed to identify disease-related single amino acid changes, but only a few tools can score the impact of non-coding variants. Among the most popular algorithms, CADD and FATHMM predict the effect of SNVs in non-coding regions by combining sequence conservation with several functional features derived from ENCODE project data. As a result, running CADD or FATHMM locally requires downloading a large set of pre-calculated information during installation. To facilitate variant annotation, we developed PhD-SNPg, a new easy-to-install and lightweight machine learning method that depends only on sequence-based features. Despite this simplicity, PhD-SNPg performs similarly to or better than more complex methods. This makes PhD-SNPg ideal for quick SNV interpretation and as a benchmark for tool development.

    Network measures for protein folding state discrimination

    Proteins fold via two-state or multi-state kinetic mechanisms, but so far there is no first-principles model that explains this difference in behavior. We exploit the network properties of protein structures by introducing novel observables to address the problem of classifying the different types of folding kinetics. These observables have a clear physical meaning in terms of vibrational modes, possible configurations compatible with the native protein structure, and folding cooperativity. The relevance of these observables is supported by a classification performance of up to 90%, even with simple classifiers such as discriminant analysis.

    Synergistic toxicity of some sulfonamide mixtures on Daphnia magna

    In livestock farming, sulfonamides (SAs) are used prophylactically and simultaneously in large numbers of animals. Therefore, traces of these compounds, alone or in combination, have been repeatedly detected in the environment. Synergistic interactions among chemicals in such mixtures are an area of concern for regulatory authorities. In this study, the acute toxic effects of binary and ternary mixtures of SAs were evaluated in Daphnia magna to verify whether they jointly exert a larger effect than would be predicted from their individual toxicities. First, following the Concentration Addition (CA) principle, preliminary observations were made by testing a number of drug combinations with an expected 50% effect. Then, the mixtures most clearly recognised as synergistic (four binary and two ternary) were assayed over a range of decreasing concentrations. The data were processed using CompuSyn software, which takes into account the different shapes of the curves when calculating the Combination Index (CI) for the evaluation of synergistic effects. For binary mixtures, synergy was also evaluated using the curvilinear isobologram method for heterodynamic drugs. The results indicate that most of the selected mixtures exhibit a synergistic effect according to the CI methodology. For binary mixtures, these findings were also confirmed by isobologram analysis. The detected synergies indicate that CA is not always a precautionary reference model for evaluating the aquatic toxicity of SA mixtures.
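
    The Combination Index mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the widely used Chou-Talalay median-effect formulation (CI as the sum of dose ratios, with CI < 1 read as synergy, CI = 1 as additivity, and CI > 1 as antagonism). The function names and all numeric values are hypothetical and are not taken from the study or from CompuSyn itself.

```python
def dose_for_effect(fa, dm, m):
    """Median-effect equation: dose of a single drug that alone produces
    the affected fraction fa, given its median-effect dose dm and slope m."""
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(doses, dms, ms, fa):
    """Sum of (dose in mixture) / (dose needed alone for the same effect fa).

    doses: dose of each component in the mixture that produced effect fa
    dms, ms: per-component median-effect parameters (illustrative inputs)
    """
    return sum(
        d / dose_for_effect(fa, dm, m)
        for d, dm, m in zip(doses, dms, ms)
    )


# Hypothetical binary mixture: each drug alone needs 10 and 20 units
# for a 50% effect; the mixture reaches 50% with 2.5 + 5 units.
ci = combination_index(doses=[2.5, 5.0], dms=[10.0, 20.0], ms=[1.0, 1.0], fa=0.5)
print(f"CI = {ci:.2f}")
```

    In this made-up case each component is present at a quarter of its stand-alone dose, so the index falls below 1 and the mixture would be classed as synergistic under the assumed formulation.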

    The posterior-Viterbi: a new decoding algorithm for hidden Markov models

    Background: Hidden Markov models (HMMs) are powerful machine learning tools successfully applied to problems in computational molecular biology. In a predictive task, the HMM is endowed with a decoding algorithm in order to assign the most probable state path, and in turn the class labeling, to an unknown sequence. The Viterbi and the posterior decoding algorithms are the most common. The former is very efficient when one path dominates, while the latter, although it does not guarantee that the automaton grammar is preserved, is more effective when several competing paths have similar probabilities. A third good alternative is 1-best, which was shown to perform as well as or better than Viterbi. Results: In this paper we introduce posterior-Viterbi (PV), a new decoding algorithm that combines the posterior and Viterbi algorithms. PV is a two-step process: first the posterior probability of each state is computed, and then the best allowed posterior path through the model is evaluated by a Viterbi algorithm. Conclusions: We show that PV decoding performs better than the other algorithms, first on toy models and then on the computational biology problem of predicting the topology of beta-barrel membrane proteins. Comment: 23 pages, 3 figures
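
    The two-step procedure described in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes a discrete-emission HMM given as NumPy arrays (`pi`, `A`, `B` are hypothetical names), treats a transition as "allowed" when its probability is non-zero, and scores a path by the product of the state posteriors along it.

```python
import numpy as np


def state_posteriors(pi, A, B, obs):
    """Step 1: scaled forward-backward; gamma[t, k] = P(state_t = k | obs)."""
    T, N = len(obs), len(pi)
    fwd, bwd, scale = np.zeros((T, N)), np.ones((T, N)), np.zeros(T)
    fwd[0] = pi * B[:, obs[0]]
    scale[0] = fwd[0].sum()
    fwd[0] /= scale[0]
    for t in range(1, T):
        fwd[t] = (fwd[t - 1] @ A) * B[:, obs[t]]
        scale[t] = fwd[t].sum()
        fwd[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        bwd[t] = (A @ (B[:, obs[t + 1]] * bwd[t + 1])) / scale[t + 1]
    gamma = fwd * bwd
    return gamma / gamma.sum(axis=1, keepdims=True)


def posterior_viterbi(pi, A, B, obs):
    """Step 2: Viterbi-style search over posteriors, restricted to paths
    whose start state and transitions have non-zero probability."""
    gamma = state_posteriors(pi, A, B, obs)
    T, N = gamma.shape
    with np.errstate(divide="ignore"):      # log(0) -> -inf is intended
        log_gamma = np.log(gamma)
    allowed = np.where(A > 0, 0.0, -np.inf)  # 0 if i -> j is permitted
    score = np.where(pi > 0, 0.0, -np.inf) + log_gamma[0]
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + allowed      # cand[i, j]: best path ending i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_gamma[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):            # trace back pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

    Unlike plain posterior decoding, which picks the most probable state at each position independently, the search in the second step can only follow permitted transitions, so the returned labeling always respects the model grammar.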

    Embedding machine-readable proteins interactions data in scientific articles for easy access and retrieval

    Extracting protein-protein interaction data from the scientific literature remains a hard, time- and resource-consuming task. This task would be greatly simplified by embedding in the source, i.e. the research articles, a standardized, concise, machine-readable encoding that describes protein-protein interaction data, making the identification and retrieval of this valuable information easier, faster, and more reliable than it is now.
    We briefly discuss how this information can be easily encoded and embedded in research papers with the collaboration of authors and scientific publishers, and propose an online demonstration tool that shows how authors can quickly and easily convert such biological data into an embeddable, accessible, computer-readable encoding.

    PhD-SNPg: updating a webserver and lightweight tool for scoring nucleotide variants

    One of the primary challenges in human genetics is determining the functional impact of single nucleotide variants (SNVs) and insertions and deletions (InDels), whether coding or noncoding. Methods have been created to detect disease-related single amino acid changes, but only some can assess the influence of noncoding variants. CADD is the most commonly used and advanced algorithm for predicting the diverse effects of genome variations. It employs a combination of sequence conservation and functional features derived from ENCODE project data. To use CADD, a large set of pre-calculated information must be downloaded during the installation process. To streamline variant annotation, we developed PhD-SNPg, a machine-learning tool that is easy to install and lightweight, relying solely on sequence-based features. Here we present an updated version, trained on a larger dataset, that can also predict the impact of InDel variants. Despite its simplicity, PhD-SNPg performs similarly to CADD, making it ideal for rapid genome interpretation and as a benchmark for tool development.

    SChloro: directing Viridiplantae proteins to six chloroplastic sub-compartments

    Motivation: Chloroplasts are organelles found in plants and involved in several important cell processes. Like other compartments in the cell, chloroplasts have an internal structure comprising several sub-compartments, to which different proteins are targeted to perform their functions. Given the relation between protein function and localization, effective computational tools for predicting protein sub-organelle localization are crucial for large-scale functional studies. Results: In this paper we present SChloro, a novel machine-learning approach to predict protein sub-chloroplastic localization, based on targeting signal detection and membrane protein information. The proposed approach performs multi-label predictions discriminating six chloroplastic sub-compartments: inner membrane, outer membrane, stroma, thylakoid lumen, plastoglobule and thylakoid membrane. In comparative benchmarks, the proposed method outperforms current state-of-the-art methods in both single- and multi-compartment predictions, with an overall multi-label accuracy of 74%. The results demonstrate the relevance of the approach, which is a good candidate for integration into more general large-scale annotation pipelines for protein subcellular localization.

    NET-GE: a novel NETwork-based Gene Enrichment for detecting biological processes associated to Mendelian diseases

    Enrichment analysis is a widely applied procedure for shedding light on the molecular mechanisms and functions underlying phenotypes, for enlarging the dataset of possibly related genes/proteins, and for helping with the interpretation and prioritization of newly determined variations. Several standard and network-based enrichment methods are available. Both approaches rely on the annotations that characterize the genes/proteins included in the input set; network-based methods also incorporate, in different ways, physical and functional relationships among genes or proteins that can be extracted from the available biological interaction networks.

    In silico evidence of the relationship between miRNAs and siRNAs

    Both short interfering RNAs (siRNAs) and microRNAs (miRNAs) mediate the repression of specific mRNA sequences through the RNA interference pathway. In recent years, several experiments have supported the hypothesis that siRNAs and miRNAs may be functionally interchangeable, at least in cultured cells. In this work we verify that this hypothesis is also supported by computational evidence. We show that a method specifically trained to predict the activity of exogenous siRNAs assigns a high silencing level to experimentally determined human miRNAs. This result not only supports the idea of siRNA and miRNA equivalence but also indicates that computational tools developed using synthetic small interfering RNAs can be used to investigate endogenous miRNAs. Comment: 8 pages, 2 figures